Overview

Dataset statistics

Number of variables: 9
Number of observations: 768
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 54.1 KiB
Average record size in memory: 72.2 B

Variable types

Numeric: 8
Categorical: 1

Warnings

Pregnancies has 111 (14.5%) zeros
BloodPressure has 35 (4.6%) zeros
SkinThickness has 227 (29.6%) zeros
Insulin has 374 (48.7%) zeros
BMI has 11 (1.4%) zeros
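
Each percentage in the warnings is the zero count divided by the 768 observations, rounded to one decimal place. A minimal sketch that reproduces the figures from the counts reported here; the names `ZERO_COUNTS` and `zero_share` are illustrative and not part of pandas-profiling:

```python
# Zero counts per column, copied from the warnings in this report.
ZERO_COUNTS = {
    "Pregnancies": 111,
    "BloodPressure": 35,
    "SkinThickness": 227,
    "Insulin": 374,
    "BMI": 11,
}
N_OBSERVATIONS = 768  # "Number of observations" in the overview


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for column, count in ZERO_COUNTS.items():
    print(f"{column} has {count} ({zero_share(count)}%) zeros")
```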

Reproduction

Analysis started: 2021-01-29 12:33:10.418018
Analysis finished: 2021-01-29 12:33:25.952579
Duration: 15.53 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

Pregnancies
Real number (ℝ≥0)

ZEROS

Distinct: 17
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.845052083
Minimum: 0
Maximum: 17
Zeros: 111
Zeros (%): 14.5%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 3
Q3: 6
95th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.369578063
Coefficient of variation (CV): 0.8763413316
Kurtosis: 0.1592197775
Mean: 3.845052083
Median Absolute Deviation (MAD): 2
Skewness: 0.9016739792
Sum: 2953
Variance: 11.35405632
Monotonicity: Not monotonic
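
Several of the statistics above are simple functions of the others: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. The sketch below checks these relations using only the Pregnancies figures reported here; the variable names are illustrative.

```python
import math

# Values copied from the Pregnancies section of this report.
n = 768
total = 2953
mean = 3.845052083
std = 3.369578063
minimum, q1, q3, maximum = 0, 1, 6, 17

derived = {
    "mean": total / n,           # sum / number of observations
    "variance": std ** 2,        # square of the standard deviation
    "cv": std / mean,            # coefficient of variation
    "range": maximum - minimum,  # maximum - minimum
    "iqr": q3 - q1,              # Q3 - Q1
}
reported = {
    "mean": 3.845052083,
    "variance": 11.35405632,
    "cv": 0.8763413316,
    "range": 17,
    "iqr": 5,
}

for name, value in derived.items():
    # The reported figures are rounded, so compare with a small tolerance.
    assert math.isclose(value, reported[name], rel_tol=1e-8), name
    print(f"{name}: {value:.10g}")
```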
Histogram with fixed size bins (bins=17)
Most frequent values

Value             Count  Frequency (%)
1                 135    17.6%
0                 111    14.5%
2                 103    13.4%
3                 75     9.8%
4                 68     8.9%
5                 57     7.4%
6                 50     6.5%
7                 45     5.9%
8                 38     4.9%
9                 28     3.6%
Other values (7)  58     7.6%

Smallest values

Value  Count  Frequency (%)
0      111    14.5%
1      135    17.6%
2      103    13.4%
3      75     9.8%
4      68     8.9%
5      57     7.4%
6      50     6.5%
7      45     5.9%
8      38     4.9%
9      28     3.6%

Largest values

Value  Count  Frequency (%)
17     1      0.1%
15     1      0.1%
14     2      0.3%
13     10     1.3%
12     9      1.2%
11     11     1.4%
10     24     3.1%
9      28     3.6%
8      38     4.9%
7      45     5.9%
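
For Pregnancies, the frequency tables together list all 17 distinct values, so the summary statistics can be recomputed from the counts alone. The sketch below rebuilds the column from the value counts and recomputes the total, sum, mean, median and variance with the standard library. It assumes the reported variance is the sample variance (divisor n - 1), which is what `statistics.variance` computes; the `counts` dictionary is assembled by hand from the tables.

```python
import statistics

# Value -> count for Pregnancies, assembled from the frequency tables above.
# 16 does not appear in any table; the 17 listed values account for all
# 17 distinct values and all 768 observations.
counts = {
    0: 111, 1: 135, 2: 103, 3: 75, 4: 68, 5: 57, 6: 50, 7: 45, 8: 38,
    9: 28, 10: 24, 11: 11, 12: 9, 13: 10, 14: 2, 15: 1, 17: 1,
}

# Expand the counts back into a sorted list of observations.
values = [value for value, count in sorted(counts.items()) for _ in range(count)]

n_obs = len(values)
total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1

print(n_obs, total, mean, median, variance)
```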

Glucose
Real number (ℝ≥0)

Distinct: 136
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.8945312
Minimum: 0
Maximum: 199
Zeros: 5
Zeros (%): 0.7%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 79
Q1: 99
Median: 117
Q3: 140.25
95th percentile: 181
Maximum: 199
Range: 199
Interquartile range (IQR): 41.25

Descriptive statistics

Standard deviation: 31.9726182
Coefficient of variation (CV): 0.2644670347
Kurtosis: 0.6407798204
Mean: 120.8945312
Median Absolute Deviation (MAD): 20
Skewness: 0.1737535018
Sum: 92847
Variance: 1022.248314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
100                 17     2.2%
99                  17     2.2%
129                 14     1.8%
125                 14     1.8%
111                 14     1.8%
106                 14     1.8%
95                  13     1.7%
108                 13     1.7%
105                 13     1.7%
102                 13     1.7%
Other values (126)  626    81.5%

Smallest values

Value  Count  Frequency (%)
0      5      0.7%
44     1      0.1%
56     1      0.1%
57     2      0.3%
61     1      0.1%
62     1      0.1%
65     1      0.1%
67     1      0.1%
68     3      0.4%
71     4      0.5%

Largest values

Value  Count  Frequency (%)
199    1      0.1%
198    1      0.1%
197    4      0.5%
196    3      0.4%
195    2      0.3%
194    3      0.4%
193    2      0.3%
191    1      0.1%
190    1      0.1%
189    4      0.5%

BloodPressure
Real number (ℝ≥0)

ZEROS

Distinct: 47
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.10546875
Minimum: 0
Maximum: 122
Zeros: 35
Zeros (%): 4.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 38.7
Q1: 62
Median: 72
Q3: 80
95th percentile: 90
Maximum: 122
Range: 122
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.35580717
Coefficient of variation (CV): 0.2800908166
Kurtosis: 5.18015656
Mean: 69.10546875
Median Absolute Deviation (MAD): 8
Skewness: -1.843607983
Sum: 53073
Variance: 374.6472712
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Most frequent values

Value              Count  Frequency (%)
70                 57     7.4%
74                 52     6.8%
68                 45     5.9%
78                 45     5.9%
72                 44     5.7%
64                 43     5.6%
80                 40     5.2%
76                 39     5.1%
60                 37     4.8%
0                  35     4.6%
Other values (37)  331    43.1%

Smallest values

Value  Count  Frequency (%)
0      35     4.6%
24     1      0.1%
30     2      0.3%
38     1      0.1%
40     1      0.1%
44     4      0.5%
46     2      0.3%
48     5      0.7%
50     13     1.7%
52     11     1.4%

Largest values

Value  Count  Frequency (%)
122    1      0.1%
114    1      0.1%
110    3      0.4%
108    2      0.3%
106    3      0.4%
104    2      0.3%
102    1      0.1%
100    3      0.4%
98     3      0.4%
96     4      0.5%

SkinThickness
Real number (ℝ≥0)

ZEROS

Distinct: 51
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.53645833
Minimum: 0
Maximum: 99
Zeros: 227
Zeros (%): 29.6%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 23
Q3: 32
95th percentile: 44
Maximum: 99
Range: 99
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 15.95221757
Coefficient of variation (CV): 0.776775494
Kurtosis: -0.5200718662
Mean: 20.53645833
Median Absolute Deviation (MAD): 12
Skewness: 0.1093724965
Sum: 15772
Variance: 254.4732453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value              Count  Frequency (%)
0                  227    29.6%
32                 31     4.0%
30                 27     3.5%
27                 23     3.0%
23                 22     2.9%
33                 20     2.6%
18                 20     2.6%
28                 20     2.6%
31                 19     2.5%
39                 18     2.3%
Other values (41)  341    44.4%

Smallest values

Value  Count  Frequency (%)
0      227    29.6%
7      2      0.3%
8      2      0.3%
10     5      0.7%
11     6      0.8%
12     7      0.9%
13     11     1.4%
14     6      0.8%
15     14     1.8%
16     6      0.8%

Largest values

Value  Count  Frequency (%)
99     1      0.1%
63     1      0.1%
60     1      0.1%
56     1      0.1%
54     2      0.3%
52     2      0.3%
51     1      0.1%
50     3      0.4%
49     3      0.4%
48     4      0.5%

Insulin
Real number (ℝ≥0)

ZEROS

Distinct: 186
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.79947917
Minimum: 0
Maximum: 846
Zeros: 374
Zeros (%): 48.7%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 30.5
Q3: 127.25
95th percentile: 293
Maximum: 846
Range: 846
Interquartile range (IQR): 127.25

Descriptive statistics

Standard deviation: 115.2440024
Coefficient of variation (CV): 1.444169856
Kurtosis: 7.214259554
Mean: 79.79947917
Median Absolute Deviation (MAD): 30.5
Skewness: 2.272250858
Sum: 61286
Variance: 13281.18008
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
0                   374    48.7%
105                 11     1.4%
140                 9      1.2%
130                 9      1.2%
120                 8      1.0%
100                 7      0.9%
94                  7      0.9%
180                 7      0.9%
110                 6      0.8%
115                 6      0.8%
Other values (176)  324    42.2%

Smallest values

Value  Count  Frequency (%)
0      374    48.7%
14     1      0.1%
15     1      0.1%
16     1      0.1%
18     2      0.3%
22     1      0.1%
23     2      0.3%
25     1      0.1%
29     1      0.1%
32     1      0.1%

Largest values

Value  Count  Frequency (%)
846    1      0.1%
744    1      0.1%
680    1      0.1%
600    1      0.1%
579    1      0.1%
545    1      0.1%
543    1      0.1%
540    1      0.1%
510    1      0.1%
495    2      0.3%

BMI
Real number (ℝ≥0)

ZEROS

Distinct: 248
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.99257812
Minimum: 0
Maximum: 67.1
Zeros: 11
Zeros (%): 1.4%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 21.8
Q1: 27.3
Median: 32
Q3: 36.6
95th percentile: 44.395
Maximum: 67.1
Range: 67.1
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 7.88416032
Coefficient of variation (CV): 0.2464371671
Kurtosis: 3.290442901
Mean: 31.99257812
Median Absolute Deviation (MAD): 4.6
Skewness: -0.4289815885
Sum: 24570.3
Variance: 62.15998396
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
0                   11     1.4%
33.3                10     1.3%
32.4                10     1.3%
32.8                9      1.2%
30.8                9      1.2%
32.9                9      1.2%
30.1                9      1.2%
Other values (238)  664    86.5%

Smallest values

Value  Count  Frequency (%)
0      11     1.4%
18.2   3      0.4%
18.4   1      0.1%
19.1   1      0.1%
19.3   1      0.1%
19.4   1      0.1%
19.5   2      0.3%
19.6   3      0.4%
19.9   1      0.1%
20     1      0.1%

Largest values

Value  Count  Frequency (%)
67.1   1      0.1%
59.4   1      0.1%
57.3   1      0.1%
55     1      0.1%
53.2   1      0.1%
52.9   1      0.1%
52.3   2      0.3%
50     1      0.1%
49.7   1      0.1%
49.6   1      0.1%

DiabetesPedigreeFunction
Real number (ℝ≥0)

Distinct: 517
Distinct (%): 67.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4718763021
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.078
5th percentile: 0.14035
Q1: 0.24375
Median: 0.3725
Q3: 0.62625
95th percentile: 1.13285
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3825

Descriptive statistics

Standard deviation: 0.331328595
Coefficient of variation (CV): 0.7021513764
Kurtosis: 5.594953528
Mean: 0.4718763021
Median Absolute Deviation (MAD): 0.1675
Skewness: 1.919911066
Sum: 362.401
Variance: 0.1097786379
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
0.254               6      0.8%
0.258               6      0.8%
0.259               5      0.7%
0.238               5      0.7%
0.207               5      0.7%
0.268               5      0.7%
0.261               5      0.7%
0.167               4      0.5%
0.19                4      0.5%
0.27                4      0.5%
Other values (507)  719    93.6%

Smallest values

Value  Count  Frequency (%)
0.078  1      0.1%
0.084  1      0.1%
0.085  2      0.3%
0.088  2      0.3%
0.089  1      0.1%
0.092  1      0.1%
0.096  1      0.1%
0.1    1      0.1%
0.101  1      0.1%
0.102  1      0.1%

Largest values

Value  Count  Frequency (%)
2.42   1      0.1%
2.329  1      0.1%
2.288  1      0.1%
2.137  1      0.1%
1.893  1      0.1%
1.781  1      0.1%
1.731  1      0.1%
1.699  1      0.1%
1.698  1      0.1%
1.6    1      0.1%

Age
Real number (ℝ≥0)

Distinct: 52
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.24088542
Minimum: 21
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 21
5th percentile: 21
Q1: 24
Median: 29
Q3: 41
95th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.76023154
Coefficient of variation (CV): 0.3537881556
Kurtosis: 0.6431588885
Mean: 33.24088542
Median Absolute Deviation (MAD): 7
Skewness: 1.129596701
Sum: 25529
Variance: 138.3030459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value              Count  Frequency (%)
22                 72     9.4%
21                 63     8.2%
25                 48     6.2%
24                 46     6.0%
23                 38     4.9%
28                 35     4.6%
26                 33     4.3%
27                 32     4.2%
29                 29     3.8%
31                 24     3.1%
Other values (42)  348    45.3%

Smallest values

Value  Count  Frequency (%)
21     63     8.2%
22     72     9.4%
23     38     4.9%
24     46     6.0%
25     48     6.2%
26     33     4.3%
27     32     4.2%
28     35     4.6%
29     29     3.8%
30     21     2.7%

Largest values

Value  Count  Frequency (%)
81     1      0.1%
72     1      0.1%
70     1      0.1%
69     2      0.3%
68     1      0.1%
67     3      0.4%
66     4      0.5%
65     3      0.4%
64     1      0.1%
63     4      0.5%

Outcome
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 43.6 KiB

Value  Count
0      500
1      268

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%
Histogram of lengths of the category

Most occurring characters

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  768    100.0%

Most frequent character per category

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  768    100.0%

Most frequent character per script

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  768    100.0%

Most frequent character per block

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Interactions

[Interaction plots: 56 Matplotlib figures, not reproduced in text.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the sign of its slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
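
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report was generated with; the function name `pearson_r` is an assumption, and the example columns are the Glucose and Age values from the "First rows" sample of this report.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample covariance and sample standard deviations (divisor n - 1);
    # the divisor cancels in the ratio, so r is the same with divisor n.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


# Glucose and Age for the first ten rows shown in the Sample section.
glucose = [148, 85, 183, 89, 137, 116, 78, 115, 197, 125]
age = [50, 31, 32, 21, 33, 30, 26, 29, 53, 54]

r = pearson_r(glucose, age)
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([2 * g + 10 for g in glucose], age)
print(r, r_rescaled)
```

A constant column has zero standard deviation, so this sketch would raise `ZeroDivisionError` for it; r is undefined in that case.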

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
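
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch, assuming that pairs tied in either variable count as neither concordant nor discordant; statistical libraries often report the tie-adjusted tau-b instead, so results on data with ties may differ from this sketch. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
        # product == 0: a tie in x or y, counted as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Four observations give six pairs: five concordant and one discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```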

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
  0            6      148             72             35        0  33.6                     0.627   50        1
  1            1       85             66             29        0  26.6                     0.351   31        0
  2            8      183             64              0        0  23.3                     0.672   32        1
  3            1       89             66             23       94  28.1                     0.167   21        0
  4            0      137             40             35      168  43.1                     2.288   33        1
  5            5      116             74              0        0  25.6                     0.201   30        0
  6            3       78             50             32       88  31.0                     0.248   26        1
  7           10      115              0              0        0  35.3                     0.134   29        0
  8            2      197             70             45      543  30.5                     0.158   53        1
  9            8      125             96              0        0   0.0                     0.232   54        1

Last rows

     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
758            1      106             76              0        0  37.5                     0.197   26        0
759            6      190             92              0        0  35.5                     0.278   66        1
760            2       88             58             26       16  28.4                     0.766   22        0
761            9      170             74             31        0  44.0                     0.403   43        1
762            9       89             62              0        0  22.5                     0.142   33        0
763           10      101             76             48      180  32.9                     0.171   63        0
764            2      122             70             27        0  36.8                     0.340   27        0
765            5      121             72             23      112  26.2                     0.245   30        0
766            1      126             60              0        0  30.1                     0.349   47        1
767            1       93             70             31        0  30.4                     0.315   23        0